Abstract

Combinatorial auctions are important as they enable 
bidders to place bids on combinations of items; compared 
to other auction mechanisms, they often increase the effi- 
ciency of the auction, while keeping risks for bidders low. 
However, the determination of an optimal winner combina- 
tion in combinatorial auctions is a complex computational 
problem. 

In this paper we (i) compare recent algorithms for win- 
ner determination to traditional algorithms, (ii) present 
and benchmark a mixed integer programming approach to 
the problem, which enables very general auctions to be 
treated efficiently by standard integer programming
algorithms (and hereby also by commercially available
software), and (iii) discuss the impact of the probability
distributions chosen for benchmarking.



1 Introduction 

Combinatorial auctions are important as they enable bid- 
ders to place bids on combinations of items; compared to 
other auction mechanisms, they often increase the efficiency 
of the auction, while keeping risks for bidders low [Rassenti 
et al., 1982; Rothkopf et al., 1995; Parkes, 1999; Wurman,
1999]. The determination of an optimal winner combination
in combinatorial auctions is an NP-hard problem
[Rothkopf et al., 1995], which has recently attracted
some research, e.g. [Rothkopf et al., 1995; Nisan, 1999;
Fujishima et al., 1999; Sandholm, 1999]. In this paper we
look further into the topic. In particular, our contributions 
are: 

• The recent algorithms by Fujishima et al. [Fujishima et
al., 1999] and Sandholm [Sandholm, 1999] are compared
to traditional algorithms for the computationally
identical problem of set packing, and hereby put into 
a proper computer science perspective. From this ex- 
ercise, we learn that many of the main features of re- 
cently presented algorithms are rediscoveries of tradi- 
tional methods in the operations research community. 

• We observe that the winner determination problem can 
be expressed as a standard mixed integer program- 
ming problem, cf. [Nisan, 1999; Wurman, 1999], and 
we show that this enables the management of very 
general problems by use of standard algorithms and 
commercially available software. This allows for ef- 
ficient treatment of highly relevant combinatorial auc- 
tions that are not supported by current algorithms. 

• The significance of the probability distributions of the 
test sets used for evaluating different algorithms is 
discussed and exemplified. Particularly we demon- 
strate that some of the distributions used for bench- 
marking in recent literature [Fujishima et al, 1999; 
Sandholm, 1999] can be efficiently managed with 
rather trivial algorithms. 

The paper is organized as follows. In Section 2 
we present a well-known set partitioning algorithm by 
Garfinkel and Nemhauser [Garfinkel and Nemhauser, 
1969], and discuss the current algorithms for optimal win- 
ner determination in the context of this algorithm. There- 
after, in Section 3, we observe that the winner determina- 
tion problem can be set up as a mixed integer programming 
problem and hereby be solved by standard algorithms and 
commercial software. In Section 4 some empirical bench- 
marking for standard mixed integer programming software 
is presented, and we discuss the significance of the prob- 
ability distribution of the test sets used for benchmarking. 
Finally, Section 5 concludes. 
2 Recent winner determination algorithms and traditional algorithms for corresponding problems

Before discussing how very general versions of winner 
determination can be solved by general purpose algorithms, 
we investigate the basic case in which a bid states that a 
bundle of commodities, $q_i = [q_{i1}, q_{i2}, \ldots, q_{ik}]$, $q_{ij} \in \{0, 1\}$
($k$ is the number of commodities), is valued at $v_i \in \mathbb{R}$. Given
a collection of such bids, the surplus maximizing com- 
bination is the solution to the integer programming prob- 
lem [Wurman, 1999; Nisan, 1999]: 

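$$\max \sum_{i=1}^{n} v_i B_i \quad \text{s.t.} \quad \sum_{i=1}^{n} q_{ij} B_i \le 1, \quad 1 \le j \le k, \qquad (1)$$

where $n$ is the number of bids and $B_i$ is a binary variable
representing whether bid $i$ is selected or not.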
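
To make Equation (1) concrete, the following minimal sketch solves a tiny instance by exhaustive enumeration. It is only an illustration of the objective and the constraint, not one of the algorithms discussed in this paper; the function name and the example bids are hypothetical.

```python
from itertools import combinations

def solve_equation_1(bids, k):
    """Exhaustively maximise sum(v_i * B_i) subject to each commodity j
    being contained in at most one selected bid.

    bids: list of (q, v) pairs, q a 0/1 list of length k, v the valuation.
    Returns (best_value, tuple_of_selected_bid_indices).
    """
    n = len(bids)
    best_value, best_choice = 0, ()
    for r in range(1, n + 1):
        for choice in combinations(range(n), r):
            # Feasibility: sum_i q_ij * B_i <= 1 for every commodity j.
            feasible = all(
                sum(bids[i][0][j] for i in choice) <= 1 for j in range(k)
            )
            if feasible:
                value = sum(bids[i][1] for i in choice)
                if value > best_value:
                    best_value, best_choice = value, choice
    return best_value, best_choice

# Hypothetical example with k = 3 commodities and n = 4 bids.
example_bids = [
    ([1, 1, 0], 10),  # bid 0 requests commodities 0 and 1
    ([0, 1, 1], 12),  # bid 1 requests commodities 1 and 2
    ([1, 0, 0], 5),   # bid 2 requests commodity 0
    ([0, 0, 1], 4),   # bid 3 requests commodity 2
]
print(solve_equation_1(example_bids, 3))  # (17, (1, 2))
```

The enumeration is exponential in the number of bids, which is why the remainder of this section is concerned with pruning and ordering.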

We focus this presentation around a set partitioning al- 
gorithm introduced by Garfinkel and Nemhauser [Garfinkel 
and Nemhauser, 1969]. (For definitions of the set parti- 
tioning problem and related problems, cf. Balas and Pad- 
berg [Balas and Padberg, 1976], and Salkin [Salkin, 1975].) 
As pointed out by the originators, "the approach is so 
simple that it appears to be obvious. However, it seems 
worth reporting because it has performed so well". Indeed, 
some of the experimental results reported by Garfinkel and 
Nemhauser seem surprisingly good compared to recent ex- 
periments when taking the hardware performance at that 
time into account. 

The principles of the Garfinkel-Nemhauser algorithm are 
as follows. The algorithm creates one list per row (i.e. com- 
modity) and each column (set/bid) is stored in exactly one 
list. Given an ordering of the rows, each set is stored in the 
list corresponding to its first occurring row. Within each list, 
the sets are sorted according to increasing cost. The search 
for the optimal solution is done in the following way: 

1. Choose the first set from the first list containing a set 
as the current solution. 

2. Add (to the current solution) the first disjoint set from 
the first list — corresponding to a row not included in 
the current solution — containing such a disjoint set, if 
any. 

3. Repeat Step 2 until one of the following happens: 

(i) The cost for the current solution exceeds the cost 
for the best solution: this branch of the search can be 
pruned, (ii) Nothing more can be added: check if this 
is a valid solution/the best solution so far. 

4. Backtrack: Replace the latest chosen set by the next
valid set in its list and go to Step 2. When no more
sets can be selected from the list, back up further
recursively. If no more backtracking can be done,
terminate.

Since the problem of Equation (1) is equivalent to the 
definition of set packing and the problems of set pack- 
ing and set partitioning can be transformed into each 
other [Balas and Padberg, 1976], the Garfinkel-Nemhauser 
algorithm can be used for winner determination in
combinatorial auctions. Adapting the algorithm to set packing
is trivial and requires no modification of the input; only
the pruning/consistency test needs a minor change.
Specifically, the sets need to be renamed as bids, cost has
to be replaced by valuation, and item 3 needs to be
replaced by:

Repeat Step 2 until one of the following happens: 
(i) The value of the current solution can not ex- 
ceed the value of the best combination found so 
far: this branch of the search can be pruned, (ii) 
Nothing more can be added: check if this is the 
best solution so far. 
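
The adapted procedure can be sketched as follows. This is a minimal illustration under our own assumptions, not the original Garfinkel-Nemhauser code: bids are given as (set of commodities, valuation) pairs, lists are sorted by decreasing valuation, and the pruning bound (at most one more bid per uncovered row) is a simple choice made for this sketch.

```python
def winner_determination(bids, k):
    """bids: list of (set_of_commodities, valuation); commodities are 0..k-1."""
    # One list per commodity (row); each bid is stored in the list of its
    # lowest-numbered commodity, sorted by decreasing valuation.
    lists = [[] for _ in range(k)]
    for items, value in bids:
        lists[min(items)].append((frozenset(items), value))
    for lst in lists:
        lst.sort(key=lambda bid: -bid[1])
    top = [lst[0][1] if lst else 0 for lst in lists]

    best = {"value": 0, "bids": []}

    def search(row, covered, value, chosen):
        if value > best["value"]:
            best["value"], best["bids"] = value, list(chosen)
        # Assumed optimistic bound: any extension takes at most one bid
        # from each list whose row is still uncovered.
        bound = value + sum(top[j] for j in range(row, k) if j not in covered)
        if bound <= best["value"]:
            return  # this branch cannot beat the best combination found
        for j in range(row, k):
            if j in covered:
                continue
            for items, v in lists[j]:
                if covered.isdisjoint(items):
                    chosen.append((items, v))
                    search(j + 1, covered | items, value + v, chosen)
                    chosen.pop()  # backtrack

    search(0, frozenset(), 0, [])
    return best["value"], best["bids"]
```

Because a bid is only ever taken from the list of its first row, and the search moves on to later rows after each choice, no combination is generated twice.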

As seen from the above description, the currently best 
performing winner determination algorithm 1, the CASS
algorithm [Fujishima et al., 1999], is apparently in major
parts a rediscovery of the Garfinkel-Nemhauser algorithm.
The main principles of both algorithms are to (i) put the bids 
in lists corresponding to the different commodities (called 
bins by Fujishima et al.), (ii) sort the bids in the list in 
some cost (valuation) related order, (iii) do pruning when- 
ever the current combination cannot be better than the best 
one found so far, and (iv) do standard backtracking. There 
are essentially two significant differences between CASS 
and the Garfinkel-Nemhauser algorithm: (i) caching of par- 
tial search results, and (ii) improved pruning. 

The caching, normally referred to as dynamic program- 
ming, is done by storing partial search results in a table. 
This is reported to often pay off, as many partial allocations 
share the same "rest term" (i.e. remaining unassigned
commodities) [Fujishima et al., 1999]. The cache can hereby
also be used for pruning; whenever the "rest term" is a sub- 
set of an already cached "rest term" and the surplus of the 
current allocation plus the surplus of the cached allocation 
is smaller than the surplus of best combination found so far, 
this branch of the search can be pruned. 
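
The idea of caching by "rest term" can be illustrated with a small memoized recursion. This sketch is not the CASS data structure; it only shows, under our own assumptions, how the best surplus for a set of unassigned commodities can be stored once and reused by every partial allocation that leaves the same set. The subset-based pruning described above is not included.

```python
from functools import lru_cache

def cached_winner_value(bids, k):
    """bids: list of (set_of_commodities, valuation); commodities are 0..k-1."""
    bids = [(frozenset(items), value) for items, value in bids]

    @lru_cache(maxsize=None)
    def best_for(rest):
        """Best surplus obtainable from the unassigned commodities in rest."""
        if not rest:
            return 0
        j = min(rest)
        # Either commodity j stays unallocated ...
        result = best_for(rest - {j})
        # ... or it goes to a bid that fits entirely inside rest.
        for items, value in bids:
            if j in items and items <= rest:
                result = max(result, value + best_for(rest - items))
        return result

    return best_for(frozenset(range(k)))
```

The table can grow to one entry per subset of commodities, so in practice a bounded cache would be needed for larger markets.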

Compared to the simple pruning described in the 
Garfinkel-Nemhauser algorithm, Sandholm's algo- 
rithm [Sandholm, 1999] and the CASS algorithm [Fu- 
jishima et al, 1999], use a more sophisticated technique 
which essentially is the ceiling test [Salkin, 1975]. 

1 At the presentation of CASS at IJCAI 1999 in Stockholm, the CASS
algorithm [Fujishima et al., 1999] was reported to outperform Sandholm's 
winner determination algorithm [Sandholm, 1999] by approximately two 
orders of magnitude for the distributions tested. 
3 Combinatorial auction winner determination as a mixed integer programming problem

In this section we describe how very general optimal 
winner determination problems can be formulated as a 
mixed integer problem. For reading on mixed integer
programming (MIP) in itself and its relation to set packing
etc., we refer to the literature on combinatorial optimiza- 
tion and operations research, e.g. [Balas and Padberg, 1976; 
Garfinkel and Nemhauser, 1969; Salkin, 1975]. 

As discussed below, by properly formulating the prob- 
lem, we get a large number of very attractive features. These 
include: 

• The formulation can utilize standard algorithms and 
hence be run directly on standard commercially avail- 
able, thoroughly debugged and optimized software, 
such as CPLEX. 2 

• There may be multiple units traded of each commodity. 

• Bidders are not restricted to bid for integer amounts. 

• Bidders can construct advanced forms of mutually ex- 
clusive bids. 

• Sellers may have non-zero reserve prices. 

• There need not be a distinction between buyers and 
sellers; a bidder can place a bid for buying some com- 
modities and simultaneously selling some other com- 
modities. 

• Complicated recursive bids with the above features can 
be expressed. 

• Very general constraints can be expressed. 

• Settings without free disposal (for some or all com- 
modities) can be managed. 

It should be pointed out that CASS and Sandholm's
algorithms can handle some of the generalizations above. For
example, mutually exclusive (XOR) bids are easily
formulated by adding dummy commodities (e.g. "a XOR b"
becomes "(a ∧ c) ∨ (b ∧ c)", where c is a dummy commodity),
but such transformations often give
rise to a combinatorial explosion of bids [Nisan, 1999].
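
The dummy-commodity transformation can be sketched as follows. The helper name and the bid representation are hypothetical; the point is only that two bids sharing an extra commodity can never both be accepted under the constraint of Equation (1).

```python
def add_xor(bids, k, i, j):
    """Make bids i and j mutually exclusive by adding one dummy commodity.

    bids: list of (set_of_commodities, valuation); commodities are 0..k-1.
    Returns (new_bids, new_k); the dummy commodity gets index k.
    """
    new_bids = []
    for index, (items, value) in enumerate(bids):
        items = set(items)
        if index in (i, j):
            items.add(k)  # both bids now compete for the dummy commodity
        new_bids.append((items, value))
    return new_bids, k + 1

# "a XOR b": bid 0 wants commodity 0, bid 1 wants commodity 1.
xor_bids, xor_k = add_xor([({0}, 6), ({1}, 7)], 2, 0, 1)
```

Each pair of mutually exclusive bids needs its own dummy commodity in this scheme, which hints at why richer XOR structures lead to the growth in problem size noted above.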

It is noteworthy that the formulation of Equation (1) can
be run directly by commercially available software, and in
Section 4 some empirical comparison between recent algorithms
and the standard CPLEX software is shown. (Note
that the formulation of Equation (1) is applicable only to
the simple case discussed in Section 2 and that the linear
programming form of the winner determination problem
in general will look different.) However, the possibility
of using off-the-shelf software has been overlooked and
current benchmarked algorithms [Fujishima et al., 1999;
Sandholm, 1999] are written from scratch. (The formulations
of this problem given by Rothkopf et al. [Rothkopf
et al., 1995], Fujishima et al. [Fujishima et al., 1999], and
Sandholm [Sandholm, 1999], are however not suited as direct
input for standard software.)

2 See www.cplex.com
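
As an illustration of what "direct input for standard software" can look like, the sketch below writes Equation (1) as text in the LP file format, which to our knowledge is accepted by CPLEX and several other solvers. It is not the input generator used for the experiments in this paper; the function name, variable names (B1, B2, ...) and constraint names (c1, c2, ...) are assumptions, and positive valuations are assumed.

```python
def to_lp_format(bids, k):
    """Return Equation (1) as text in the LP file format.

    bids: non-empty list of (set_of_commodities, valuation);
    commodities are 0..k-1 and valuations are assumed positive.
    """
    names = [f"B{i + 1}" for i in range(len(bids))]
    lines = ["Maximize"]
    lines.append(" obj: " + " + ".join(
        f"{value} {name}" for (_, value), name in zip(bids, names)))
    lines.append("Subject To")
    for j in range(k):
        members = [name for (items, _), name in zip(bids, names) if j in items]
        if members:  # commodities nobody bids on need no constraint
            lines.append(f" c{j + 1}: " + " + ".join(members) + " <= 1")
    lines.append("Binary")
    lines.append(" " + " ".join(names))
    lines.append("End")
    return "\n".join(lines)

lp_text = to_lp_format([({0, 1}, 10), ({1, 2}, 12), ({0}, 5), ({2}, 4)], 3)
print(lp_text)
```

The more general auctions discussed in this section would replace the right-hand side 1 by the number of units available, relax some variables from binary to continuous, or add further linear constraints.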

The formulation used here conforms with the formulations
by Wurman [Wurman, 1999] and Nisan [Nisan, 1999].
Compared to these formulations, we observe that much 
more general combinatorial auctions than the ones treated 
so far can be expressed as mixed integer problems, and that 
they can be successfully managed by standard operations 
research algorithms and commercially available software. 

With standard MIP methods, any constraint that can be 
expressed in linear terms in the involved variables can be 
used when defining a bid. Thus in the general case, the ob- 
jective function will consist of terms representing the value 
of a (certain part of a) bid, times the degree to which it is 
accepted. That is, we need not restrict the auction to only 
binary choices. Correspondingly, the feasibility constraint 
need in the general case not be restricted to the cases where 
there is only one unit for sale of each commodity, free dis- 
posal can be assumed, etc. 3 It is also possible to use the MIP 
approach for the minimal winning bid problem [Rothkopf 
et al, 1995], i.e. the problem of replying to the question 
"If I request these and these amounts of these and these 
commodities, how much do I have to pay to get my bid 
accepted?". 

Clearly, requiring each bidder to give its bids
as terms to be added to the objective function and the
feasibility constraints, together with a number of further
constraints, may put too heavy a burden on the bidder, cf. [Nisan,
1999]. Therefore it makes sense to construct different forms of
high-level support for expressing bids and using the
combinatorial auction.

4 Empirical benchmarking 

In this section we give some empirical data in order to 
compare the new approach to optimal winner determination 
based on standard MIP (and consequently tested with off- 
the-shelf software) to current highly specialized approaches 
in cases where these can be applied. 

3 The free disposal assumption (i.e. that non-allocated resources can
be disposed of without cost) typically has a very drastic impact on
"anytime behavior"; without the free disposal assumption, finding a
feasible allocation is significantly harder.

It is generally recognized that it is most unfortunate that
real-world test data are not available. As long as no such
test data are available, it is perhaps reasonable to try
different types of distributions and try to identify which types
of distributions are "hard" and which are "easy" for
different types of algorithms. The empirical benchmarks are
performed on the same test data as was given by Sandholm
[Sandholm, 1999] and Fujishima et al. [Fujishima et
al., 1999]. We use these tests not because we are convinced
that they are the most relevant, but because they are the only 
ones for which we have data for competing algorithms. Our 
experience so far is that (seemingly small differences of) 
the bid distributions chosen have an extreme impact on the 
performance of the algorithms. It is theoretically shown
that (unless P = NP) no efficient general optimal algorithm
can be constructed [Rothkopf et al., 1995] (not even
if a certain approximation error can be tolerated [Sandholm, 
1999, Proposition 2.3]). Therefore one should be very care- 
ful when arguing about the practical usefulness of an algo- 
rithm without access to real data. 

The software used is CPLEX version 6.5. The hardware 
setup of the experiments has been one standard uniproces- 
sor 550MHz PC with 128Mb of RAM memory. 4 The time 
required for loading the test data into memory has not been 
included in the results below. The time reported is "wall 
time", i.e. an upper bound on processor time used. 

For the sake of reproducibility, all test data and programs
required for generating the CPLEX input format, as well as
detailed descriptions of the CPLEX settings used, are
available from the Internet. 5

Figure 1 to Figure 5 show the results of the respective 
tests. For each distribution, 10 instances have been tested, 
which is sufficient for obtaining a basic illustration. The 
instance sizes have been selected to match the sizes tested 
in the literature and/or to give reasonable computation time. 

During the search for the optimal solution, CPLEX re- 
ports the best solution found so far (i.e. works as what 
is sometimes referred to as an anytime algorithm). In the 
figures, the curves denote the surplus of the currently best 
solution normalized by the surplus of the optimal solution. 
For each moment in time, the worst, average and best so- 
lution is plotted. The point at which optimality is verified 
is marked as a special point (> and < for minimum and 
maximum time respectively, O for the average, and * for all 
instances other than min and max). 

4 Note that the software we have used is also available in versions for
parallel execution. Hence, by using a high performance parallel platform,
performance can be improved significantly if required.

5 See www.docs.uu.se/~tein/lPForComb.html.

We have tested the following distributions.

Random [Sandholm, 1999]

Definition: For each bid, pick the number of commodities
requested randomly from 1 to the number of commodities
in the market. Randomly choose the actual commodities
requested without replacement. Draw a random integer
valuation between 1 and 1000. (For this and all other
distributions we have used integer valuations for simplicity of
parsing etc. However, our experience is that changing to
real numbers increases computation time by less than 10%;
a negligible number for our purposes.)
Results and discussion: CPLEX determines the optimal 
winner efficiently, see Figure 1; the timings are superior to
those presented by Sandholm [Sandholm, 1999]. Further- 
more, we note that this distribution is very simple in the fol- 
lowing sense: Since the price of a bid is not weighted by the 
number of commodities, small bids will be dominating. A 
simple preprocessing, column dominance checking [Salkin, 
1975], will decrease the problem size to a simple degen- 
erate case. On a high level: for large test sets, the proba- 
bility that any bid requesting more than one commodity is 
in the optimal combination is close to nil. (Hint: The ex- 
pected summed valuation of two bids requesting only one 
commodity is twice the expected valuation of a bid request- 
ing two commodities, and so on.) We have implemented 
simple algorithms for doing column dominance checking 
and they have, for all instances with a significant number 
of bids, been able to reduce the number of bids to the num- 
ber of commodities (and obtain the optimal solution without 
any further processing). As an example we can — instead of 
using CPLEX— simply reduce 180000 bids (and 30 com- 
modities) to 30 bids (and also find the optimal combination) 
in 4s with a non-optimized Java implementation of a heuris- 
tic column dominance checking algorithm on an ordinary 
550MHz PC. Clearly, this suggests that a trivial approxi- 
mate algorithm should be used for this distribution: Select 
the highest bid — requesting only one commodity — for each 
commodity. For all bid sets we have tried (of any significant 
size), this would have resulted in an optimal solution. 
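
The distribution and the trivial approximate algorithm can be sketched as follows. This is an illustration only, not the Java implementation mentioned above; the function names are hypothetical, and the heuristic is not guaranteed to be optimal on small bid sets.

```python
import random

def random_bid(k, rng):
    """One bid from the Random distribution as defined above."""
    size = rng.randint(1, k)                        # number of commodities requested
    items = frozenset(rng.sample(range(k), size))   # chosen without replacement
    return items, rng.randint(1, 1000)              # valuation is not scaled by size

def best_singletons(bids, k):
    """Trivial heuristic: per commodity, keep the highest single-commodity bid."""
    best = {}
    for items, value in bids:
        if len(items) == 1:
            (j,) = items
            if value > best.get(j, 0):
                best[j] = value
    return sum(best.values()), best

rng = random.Random(0)
sample_bids = [random_bid(30, rng) for _ in range(20000)]
total, per_commodity = best_singletons(sample_bids, 30)
```

With many bids, each commodity is likely to attract a single-commodity bid with a valuation close to 1000, which is the intuition behind the hint given in the text.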

The special character of the computation — only one 
"step" in the surplus of the best and worst solutions and 10 
steps for the average solution — is explained by the nature of 
the distribution. From the above reasoning, it is easy to see 
that one can establish one price per commodity supporting 
the optimal allocation. When this holds, an LP solution to 
the problem is the optimal allocation [Nisan, 1999]. The 
principle of CPLEX (which is to first establish an LP so- 
lution and then do a branch and bound) then explains this 
behavior.

The characteristics of the weighted random distribu- 
tion [Sandholm, 1999] are similar both in terms of compu- 
tation time as well as in the possibility to use a trivial highly 
efficient approximate algorithm. (In this case the trivial al- 
gorithm is to simply select the bid with the highest valua- 
tion.) 

[Plot: Random Distribution. 400 items, 2000 bids.]

Figure 1. The Random distribution for 2000 bids and 400
commodities. For this distribution the time for obtaining 
the optimal combination and being able to guarantee this 
optimality is the same for most tested instances. Worst case 
is in the area of 160s. In comparison, Sandholm's algorithm 
is reported to manage 1000 bids and 400 commodities in 
approximately 6000s on average. 



Decay [Sandholm, 1999]. 

Definition: Make the bid request a random commodity.
Then repeatedly add a new commodity with probability α
until a commodity is not added or the bid requests all
commodities in the market. Pick a random integer valuation between
1 and 1000 and multiply by the number of commodities
requested.
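
A minimal generator for this definition might look as follows. We assume that "a new commodity" means one not already requested by the bid and that it is drawn uniformly; the function name is hypothetical.

```python
import random

def decay_bid(k, alpha, rng):
    """One bid from the Decay distribution with k commodities."""
    items = {rng.randrange(k)}  # start with one random commodity
    # Keep adding with probability alpha until a draw fails or the bid
    # already requests every commodity in the market.
    while len(items) < k and rng.random() < alpha:
        remaining = [j for j in range(k) if j not in items]
        items.add(rng.choice(remaining))  # assumption: uniform over unrequested
    return frozenset(items), rng.randint(1, 1000) * len(items)

rng = random.Random(0)
decay_bids = [decay_bid(200, 0.55, rng) for _ in range(1000)]
```

Under this reading the bid size is geometrically distributed (truncated at k), so with α = 0.55 most bids request only a few commodities.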

Results and discussion: CPLEX performs well compared 
to Sandholm's algorithm [Sandholm, 1999], and rather 
large bid sets can be efficiently managed, see Figure 3. 

[Plot: Decay Distribution, alpha=0.55, 200 items, 10000 bids; captioned as Figure 3.]

Uniform [Sandholm, 1999]

Definition: Draw the same number of randomly chosen
items for each bid. Pick an integer valuation from
[500..1500] and multiply by the number of commodities.

Results and discussion: CPLEX performs well compared 
to Sandholm's implementation [Sandholm, 1999], cf. Fig- 
ure 2. Still this appears to be a "hard" distribution for 
CPLEX. 

[Plot: Uniform Distribution. 100 items, 500 bids.]

Figure 2. The Uniform distribution for 500 bids and 100
commodities, where each bidder bids for three commodities.
The instances differ significantly, and the execution times
vary from a few seconds to around 470s. Sandholm's algorithm
is reported to
manage 150 bids and 100 commodities in approximately 
40000s on average. (For this latter instance size, CPLEX 
finds the verified optimal solution in around 400ms.) 



[Plot for Figure 3; x-axis: Time (s), 0 to 600.]

Figure 3. The Decay distribution for 10000 bids and 200
commodities with α = 0.55. Worst case is around 580s.
For the same distribution with 200 bids and 200 commodi- 
ties the execution time with Sandholm's algorithm is re- 
ported to be around 40000s. 



Binomial [Fujishima et al., 1999]

Definition: The probability distribution for a bid requesting 
n commodities of k commodities in the market is 

$$f(n) = p^{n} (1 - p)^{k - n} \binom{k}{n},$$

with p = 0.2. An integer valuation is drawn from 500 to 
1500 and multiplied by n. 

Results and discussion: Under the assumption that the
benchmarks given by Fujishima et al. [Fujishima et al.,
1999] denote the time required for finding a verified
optimal solution 6, the CASS implementation is faster than
CPLEX (around 20 times) for 30 commodities and 3000 
bids. Still CPLEX can manage rather big bid sets efficiently. 
(Note that in Figure 4, 30000 bids are used.) 

6 There is an important distinction between the time required for finding an
optimal solution and the time required for verifying that it is optimal. We
have not been able to tell from the paper by Fujishima et al. [Fujishima et
al., 1999] which of the two they present.

0-7695-0625-9/00 $10.00 © 2000 IEEE

[Plot: Binomial Distribution. p=0.2. 30 items, 30000 bids.]

Figure 4. The Binomial distribution is shown for 30000
bids and 30 commodities. Here the relative surplus rises to
the area of 97% in a few seconds. The optimal combination
is normally found after between 30 and 150s. The worst
case for finding a guaranteed optimal solution is around
240s. (For smaller instances, 3000 bids, the timings reported
for CASS are approximately 20 times better than those
of CPLEX.)

The CASS algorithm was also reported to be tested on
1500 bids and 150 commodities [Fujishima et al., 1999].
However, a later version of the paper suggests that there was
a typo in the published version. Instead, the number of bids
seems to be 15000. While CPLEX can manage 1500 bids
efficiently, it fails to handle 15000 bids. On the other hand,
it is interesting to note that this problem instance, which at
first glance appears to be "hard", actually turns out to be
"easy". Indeed, there is a simple algorithm which in our
experience outperforms CASS.

The reason that the problem is simple is, modulo some
details, that as the number of commodities increases, the
probability that two bids collide (i.e. share a commodity)
increases. Therefore, the expected number of bids in the
optimal solution is small (two or three).

A quickly programmed algorithm, enumerating all non- 
colliding combinations (i.e. it uses neither pruning nor 
ranking heuristics) finds the verified optimal solution in ap- 
proximately 15 s on a standard 450MHz PC with 256Mb 
RAM. This is around 15 times faster than CASS, under the 
assumption that Figure 3 in the Fujishima et al. paper [Fu- 
jishima et al, 1999] denotes 15000 bids. (If the number of 
bids is 1500 as stated in the paper the difference is of course 
larger.) This algorithm works as follows: 

1. For each bid, construct a list of all non-colliding 
bids. Assuming a certain probability distribution and 
a certain number of commodities, the time for this is 
quadratic in the number of bids. 

2. Combine bids in a depth first manner. No bid
comparisons are made, instead we use the lists of
non-colliding bids. As we combine bids, we also combine
their lists by taking their intersection. The short
length of the lists of non-colliding bids gives the fast
execution. (For an input of 15000 bids, 150 commodities,
and p = 0.2, a typical number of valid combinations
of two bids is 250000, three bids 40000, and four
bids 200.)

Of the 15s spent on each test set, around 14s were spent 
in step 1. Hence, neither pruning nor ranking heuristics can 
improve this algorithm significantly here. 
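
The two steps can be sketched as follows. This is a reconstruction from the description above, not the program used for the reported timings; in particular, storing only later-numbered non-colliding bids in each list is our own device for avoiding duplicate combinations.

```python
def enumerate_optimal(bids):
    """bids: list of (set_of_commodities, valuation). Returns (value, indices)."""
    n = len(bids)
    sets = [frozenset(items) for items, _ in bids]
    # Step 1: for each bid, the later bids it does not collide with
    # (quadratic in the number of bids).
    compatible = [
        {j for j in range(i + 1, n) if sets[i].isdisjoint(sets[j])}
        for i in range(n)
    ]
    best = [0, []]

    def extend(value, chosen, candidates):
        if value > best[0]:
            best[0], best[1] = value, list(chosen)
        for pos, i in enumerate(candidates):
            # Step 2: intersect the current candidate list with the list
            # of bid i instead of comparing bids again.
            narrowed = [j for j in candidates[pos + 1:] if j in compatible[i]]
            chosen.append(i)
            extend(value + bids[i][1], chosen, narrowed)
            chosen.pop()

    extend(0, [], list(range(n)))
    return best[0], best[1]
```

When almost all pairs of bids collide, the candidate lists shrink to nothing after two or three levels, so the depth first phase is short compared to Step 1.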

Exponential [Fujishima et al., 1999]

Definition: The probability distribution is defined as
$f_e(n) = C e^{-n/p}$ (where $C$ is assumed to be implicitly
defined by $\sum_{n=1}^{k} f_e(n) = 1$, with $k$ as before the
number of commodities). The valuation is an integer,
rectangularly drawn from [500..1500] and multiplied by the number
of requested commodities.

Results and discussion: CASS appears to be around a fac- 
tor of two faster than CPLEX for 3000 bids and 30 com- 
modities, cf. Figure 5. In an extended on-line version of 
their paper, Fujishima et al. report that the CASS algorithm 
finds the optimal solution (though it is not clearly stated that 
optimality is verified) in around 550s for 4500 bids and 45 
commodities. CPLEX finds the verified optimal solution 
for the corresponding instance size in around 8s on average 
(with small variation). 

[Plot: Exponential Distribution. p=5, 30 items, 3000 bids; x-axis: Time (s).]

Figure 5. The Exponential distribution for 3000 bids and 
30 commodities. The verified optimal solution is found in at 
most around 2.3s. The corresponding timing for the CASS 
algorithm is slightly above 1s.

4.1 Discussion 

In sum, CPLEX performs very well for many of the
tested distributions. Under most reasonable assumptions
about the collection of bids, the computation time is
relatively small. Furthermore, if the bids are submitted in
sequence over some period of time, CPLEX can do
subsequent searches starting from the best solution found up
to the point of the arrival of a new bid. For the harder
distributions, which hence are of main interest, CPLEX
is around five orders of magnitude faster than Sandholm's
algorithm. As CASS has been reported to outperform
Sandholm's algorithm by around two orders of magnitude for
the harder distributions, there is an indication that CPLEX
is also faster than CASS here. Still, as a consequence
of the NP-hardness of the problem (and as indicated by
our empirical study), neither CPLEX nor any other known
algorithm is a silver bullet for this problem. Even though
CPLEX outperformed the recent algorithms for the sparser
distributions (sparse in the sense that few bids collide),
implicit enumeration algorithms, for example in the spirit
of the Garfinkel-Nemhauser algorithm, cf. Section 2,
become highly competitive for denser distributions. Again,
we should bear in mind that CPLEX is general-purpose
software and that the comparisons are performed only for
the very special cases where Sandholm's algorithm and the
CASS implementation can be used.

From our experiments with different families of
algorithms it is clear that if the probability distribution is known
to the auctioneer, it is sometimes possible to construct
algorithms that capitalize significantly on this knowledge. Our
main conclusion so far is that it is very important to ob- 
tain some realistic data and investigate whether it has some 
special structures that can be utilized by highly specialized 
algorithms (assuming that standard algorithms fall short on 
practically relevant instances). 

The Random distribution, the Weighted random
distribution, and the Binomial distribution with many
commodities are three very illustrative examples
of distributions that at first glance may seem "hard" but
turn out to be rather "easy". As seen above, it is easy to
construct very simple yet very efficient algorithms for these
special cases.

The construction of realistic probability distributions
based on some of our main application areas (such as
electronic power trade and train scheduling markets), together
with some reasonable agent strategies in certain attractive
combinatorial auction models, such as iBundle [Parkes,
1999] or AkBA [Wurman, 1999], is important future work.
However, one brief reflection on realistic probability
distributions can be given already here: there are good reasons to
believe that real distributions will be much harder than the
ones described above. For example, if the iBundle auction
is used and we have agents whose strategy is to bid only
ε above the current prices (or to take the "ε-discount"),
we will have a very "tight" distribution; most bids are part
of some combination which is close to optimal. This makes
pruning drastically harder. For example, we tried the
Uniform distribution, but with 100000 added to each valuation
(i.e. the valuations vary by only around 1%), and this increased
the execution time of CPLEX by a factor of around 20. But there
are also other aspects of the hardness of real-world
distributions [Nisan, 1999]. Again this calls for gathering of
real-world (or at least derivation of realistic) data, before
focusing on heavily specialized algorithms.



5 Conclusions 

In this paper we discussed important computational as- 
pects of optimal winner determination in combinatorial auc- 
tions. We have compared recent approaches to this problem 
with a traditional approach to set partitioning, which can be 
used for optimal winner determination. The main conclu- 
sion of this comparison is that many of the features of cur- 
rent algorithms are rediscoveries of well-known methods. 

We then discussed how mixed integer programming 
could be utilized to manage more general problems than the 
ones managed by the recent highly specialized algorithms, 
and that commercially available software performs excel- 
lently for many problem instances. We believe that this can 
enable the application of combinatorial auctions to applica- 
tions to which there are not yet any winner determination 
algorithms available. The approach introduced here can be 
used in combination with many different forms of combi- 
natorial algorithms and is of interest regardless of whether 
bids are sealed or open, whether the auction is iterative or 
one-shot, and whether the computation is centralized or de- 
centralized (e.g. let the bidders suggest better solutions). 

We also discussed and exemplified the enormous impact 
the probability distribution of a given test has on the compu- 
tation time. It was shown that some of the distributions used 
in benchmarking current algorithms allow for very simple 
and efficient algorithms that take advantage of the structure 
of these distributions. 

In summary, our conclusions are that:

• much can be gained by capitalizing on the achievements
made in operations research and combinatorial
optimization,

• more work is needed on the study of what real-world 
(or at least realistic) instances may look like, and 

• highly specialized algorithms are mainly of interest 
(from an e-commerce point of view) for real-world in- 
stances for which standard algorithms fall short. 

Not only is it useful for electronic commerce to take
advantage of existing achievements in operations research
and combinatorial optimization; introducing "new"
algorithms while overlooking existing theory is also
scientifically problematic. Furthermore, it is debatable
whether developing "new" set packing algorithms
benchmarked on arbitrary distributions is a relevant
e-commerce research activity. On the other hand, (i)
gathering real-world distributions (or deriving realistic ones
from realistic agent preferences, agent strategies and market
mechanisms), (ii) investigating state-of-the-art operations
research and combinatorial optimization algorithms
for these settings, and (iii) developing special-purpose
algorithms where needed, definitely are.

A final, more fundamental issue is whether the approach
used in this and other papers is at all useful. One
view is that the bids should be restricted in such a way that 
polynomial time algorithms can be used to find the optimal 
allocation [Rothkopf et al, 1995]. Another view is that it 
is unnecessary to restrict the bids; if the bids happen to be 
restricted in the ways Rothkopf et al. suggest [Rothkopf et 
al, 1995], then an appropriate algorithm will rapidly find 
the optimal solution anyway [Nisan, 1999]. We very much 
agree with this latter view. Our arguments for this are as
follows. If there are only a few important
dependencies (which is required for implying that the
restrictions proposed by Rothkopf et al. do not decrease the 
surplus significantly), most bids will reflect these dependen- 
cies. The ones that reflect other (less important) dependen- 
cies will be non-competitive and pruned from the search 
at an early stage. Furthermore, we argue that if there are 
so many dependencies (of comparable importance) that the 
winner determination indeed is computationally intractable, 
then it seems safe to conjecture that it is always better to
allow bids on all combinations and use an approximate
algorithm than to restrict the bids. The following heuristic
algorithm supports this point of view. First identify
the most important dependencies (i.e. the ones that would
have been used had we taken the restricting approach
of Rothkopf et al. [Rothkopf et al, 1995]); then, as a
first heuristic in the search, consider only bids that fulfill
the restrictions (and let this heuristic be common
knowledge). Once the optimal solution of this restricted bid
set has been found, add all other bids and search until a
certain deadline is met, and take the best solution found up
to that point. Under the assumption that the time for
extracting only the bids fulfilling the restrictions is negligible
(it must indeed be small for the restricting approach to be
successful, since otherwise checking the validity of bids
would be too hard), this approach can safely be expected to
always do at least as well as the approach of only treating
the restricted bid space.
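
The two-phase heuristic described above can be sketched as follows. This is
only an illustration under stated assumptions, not the implementation
benchmarked in this paper: the branch-and-bound routine is a deliberately
simple stand-in for a real solver, and the names `best_allocation`,
`two_phase` and `is_contiguous`, as well as the restriction itself (bids on
contiguous runs of an ordered item list), are hypothetical.

```python
import time

ORDER = "ABCD"  # hypothetical linear ordering of the items


def is_contiguous(bid):
    """Hypothetical restriction: the bid covers a contiguous run of ORDER."""
    positions = sorted(ORDER.index(item) for item in bid[0])
    return positions[-1] - positions[0] + 1 == len(positions)


def best_allocation(bids, incumbent=(0.0, ()), deadline=None):
    """Depth-first branch and bound over (items, price) bids.

    Starts from `incumbent` and never returns anything worse. If `deadline`
    (a time.monotonic() value) passes, the best allocation so far is returned.
    """
    best = [incumbent[0], list(incumbent[1])]
    # suffix[i] = total price of bids[i:], a crude upper bound used for pruning
    suffix = [0.0] * (len(bids) + 1)
    for i in range(len(bids) - 1, -1, -1):
        suffix[i] = suffix[i + 1] + bids[i][1]

    def search(i, used, value, chosen):
        if value > best[0]:
            best[0], best[1] = value, list(chosen)
        if i == len(bids) or value + suffix[i] <= best[0]:
            return  # nothing left, or no way to beat the incumbent
        if deadline is not None and time.monotonic() > deadline:
            return  # out of time: keep the best solution found so far
        items, price = bids[i]
        if not (items & used):  # accept bid i if it conflicts with nothing
            chosen.append(bids[i])
            search(i + 1, used | items, value + price, chosen)
            chosen.pop()
        search(i + 1, used, value, chosen)  # reject bid i

    search(0, frozenset(), 0.0, [])
    return best[0], best[1]


def two_phase(bids, is_restricted, seconds):
    """Phase 1: solve the restricted bid set to optimality.
    Phase 2: add all other bids and search until the deadline."""
    restricted = [b for b in bids if is_restricted(b)]
    phase1 = best_allocation(restricted)
    deadline = time.monotonic() + seconds
    return best_allocation(bids, incumbent=phase1, deadline=deadline)


bids = [
    (frozenset("AB"), 5.0),
    (frozenset("CD"), 5.0),
    (frozenset("AC"), 7.0),  # violates the contiguity restriction
    (frozenset("BD"), 7.0),  # violates the contiguity restriction
    (frozenset("A"), 1.0),
]
value, chosen = two_phase(bids, is_contiguous, seconds=5.0)
print(value, [sorted(items) for items, _ in chosen])
```

Because the second phase starts from the optimum of the restricted bid set,
its result can never be worse than that optimum, whatever the deadline.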

Rothkopf et al. [Rothkopf et al, 1995] discuss another 
attractive idea; to let bidders suggest winning allocations. 
As each bidder will prioritize its own bids heavily in such
a search (in order to find a winning combination of which
its own bids are a part), this may serve as a very efficient
parallelization of the search and utilization of the
computational power of the participating agents. In auctions with
high values in which all bidders can propose better
combinations, and in which highly optimized software (such as the
one described in this paper) is used, we can therefore expect
that very large bid spaces can be searched in reasonable
time. If, despite good heuristics, considerable computational
power (optionally with the bidders participating in the
search), and highly efficient algorithms (such as CPLEX),
the problem of finding a good solution still is computationally
intractable, then it is probably generally very challenging
to construct a simple and computationally efficient auction
with any significant economic efficiency.

Integer programs are usually NP-hard, and the SPP is no exception. But it sometimes happens that the vertices of the polyhedron

$P(M) = \{\, x : \sum_{j \in J} m_{ij} x_j \le 1 \ \forall i \in L,\ x_j \ge 0 \ \forall j \in J \,\}$

all have integer coordinates, which indicates that the integrality constraint on $x$ is redundant and that the corresponding SPP can be
solved as a linear program. In such instances, a combinatorial method can usually be found that solves the given problem even more
efficiently than linear programming. Nevertheless, the connection with linear programming is a useful one, because it supplies dual
variables that can be interpreted as prices for the individual items on the list L. A variety of conditions are known to guarantee that
P(M) has integer vertices, and a polynomial-time algorithm exists for deciding whether a given matrix M with elements $m_{ij}$ from
{0, 1, -1} generates integer vertices. Most texts on integer programming devote at least a chapter to this aspect of the subject.
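
One well-known sufficient condition of this kind is total unimodularity: if every square submatrix of M has determinant 0, 1 or -1,
then a polyhedron of the form {x : Mx <= 1, x >= 0} has only integer vertices. The sketch below checks that property by brute force.
It is exponential in the size of M and is meant only for tiny examples; it is not the polynomial-time recognition algorithm referred
to above. The function names and the two example matrices (rows are items, columns are bids) are illustrative assumptions.

```python
from itertools import combinations


def det(m):
    """Exact integer determinant by Laplace expansion (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c, entry in enumerate(m[0]):
        if entry:
            minor = [row[:c] + row[c + 1:] for row in m[1:]]
            total += (-1) ** c * entry * det(minor)
    return total


def is_totally_unimodular(m):
    """True if every square submatrix of m has determinant in {-1, 0, 1}."""
    rows, cols = len(m), len(m[0])
    for k in range(1, min(rows, cols) + 1):
        for rs in combinations(range(rows), k):
            for cs in combinations(range(cols), k):
                sub = [[m[r][c] for c in cs] for r in rs]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


# Bids on contiguous runs of three ordered items: {1,2}, {2,3}, {1,2,3}, {2}.
interval_bids = [
    [1, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]

# Bids {1,2}, {2,3}, {1,3}: an odd cycle, whose LP relaxation has the
# fractional vertex x = (1/2, 1/2, 1/2).
odd_cycle_bids = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]

print(is_totally_unimodular(interval_bids))
print(is_totally_unimodular(odd_cycle_bids))
```

The first matrix has consecutive ones in every column, a classical totally unimodular structure; the second has determinant 2, so the
check fails and the linear program alone is no longer guaranteed to return an integer solution.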

Generally speaking, the conditions guaranteeing that P(M) has only integer vertices are unduly restrictive, in the sense that the
more interesting problems tend to generate constraint matrices M for which P(M) possesses both integer and noninteger vertices.
Integer programming has come a long way since the dawn of the computer age, and is now able to handle surprisingly intricate
problems. In particular, Brenda Dietrich, one of the organizers of the IMA workshop, described the results her group at IBM
has been obtaining with column-generation methods. 

Column generation is a technique for solving linear programs with so many variables (each one corresponding to a column of
the constraint matrix M) that M overflows main memory. For a matrix M with n rows, the technique begins with the generation
of a submatrix with n rows, and the selection from its columns of an optimal basis for the column space of the submatrix. The unused
columns of the submatrix can then be discarded and replaced by n newly generated columns of M; a new optimal basis is then found
within the modified submatrix. If each step is performed judiciously, each successive optimal basis will strictly excel its
predecessor, and the process will terminate in an optimal basis for the entire column space of M, having generated only relatively
few of the many columns of M. Dietrich described a few of the successes her group at IBM has had with this technique.
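
In the usual textbook formulation, the step of choosing which columns to generate next can be written down explicitly. The notation
below is an assumption, since the passage above does not spell out an objective: $v_j$ is the value attached to column j, $J'$ is the
set of columns generated so far, the constraints are taken to be of packing form, and $\pi_i \ge 0$ are the optimal dual variables
(item prices) of the restricted linear program.

```latex
\begin{align*}
\text{restricted program:}\quad
  & \max \sum_{j \in J'} v_j x_j
    \quad \text{s.t.} \quad \sum_{j \in J'} m_{ij} x_j \le 1 \ \ \forall i \in L,
    \qquad x_j \ge 0 \ \ \forall j \in J', \\
\text{pricing rule:}\quad
  & \text{generate column } k \in J \setminus J' \text{ only if }
    v_k - \sum_{i \in L} m_{ik}\, \pi_i > 0 .
\end{align*}
```

A column whose value does not exceed the total price of the items it covers cannot improve the current solution. When no remaining
column passes the pricing rule, the current basis is optimal for the full linear program, even though most columns of M were never
generated.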

Special-purpose commercial software packages exist for solving auction-related integer programs. Logistics.com's OptiBid
has been used in situations in which the number of bidders varied from 12 to 350, the average being about 120. The number of items
(typically lanes) has ranged from 500 to 10,000. SAITECH-INC's bidding software SBID, likewise based on integer programming,
is said to work on problems of similar size. Combinatorial auctions seem likely to occupy the attention of auction theorists for some
time to come.